Co-regularized PLSA for Multi-view Clustering
Authors
Abstract
Multi-view data is common in a wide variety of application domains. Properly exploiting the relations among different views can make a learning problem of interest easier. To this end, we propose an extended Probabilistic Latent Semantic Analysis (PLSA) model for multi-view clustering, named Co-regularized PLSA (CoPLSA). CoPLSA integrates the individual PLSA models of the different views through pairwise co-regularization. The central idea behind the co-regularization is that the sample similarities in the topic space of one view should agree with those of another view. An EM-based scheme is employed for parameter estimation, and a locally optimal solution is obtained through an iterative process. Extensive experiments are conducted on three real-world datasets, and the comparative results demonstrate the superiority of our approach.
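The co-regularization idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact form of the regularizer, so everything here is an assumption made for illustration: similarities are taken as inner products of per-sample topic distributions P(z|d), disagreement is measured by a squared Frobenius norm, and the function names `topic_similarity`, `co_regularizer`, and `total_penalty` are hypothetical.

```python
import numpy as np


def topic_similarity(theta):
    """Pairwise sample similarities in topic space.

    theta: (n_samples, n_topics) array whose rows are P(z|d).
    Assumption: similarity is the inner product of topic distributions.
    """
    return theta @ theta.T


def co_regularizer(theta_a, theta_b):
    """Disagreement between the similarity matrices of two views.

    Assumption: squared Frobenius norm of the difference; the paper
    may use a different measure.
    """
    diff = topic_similarity(theta_a) - topic_similarity(theta_b)
    return float(np.sum(diff ** 2))


def total_penalty(thetas):
    """Sum the pairwise penalty over all unordered pairs of views."""
    n_views = len(thetas)
    return sum(
        co_regularizer(thetas[i], thetas[j])
        for i in range(n_views)
        for j in range(i + 1, n_views)
    )


# Two views of the same two samples, each with two topics.
view_a = np.array([[1.0, 0.0], [0.0, 1.0]])  # samples in different topics
view_b = np.array([[1.0, 0.0], [1.0, 0.0]])  # samples in the same topic
print(co_regularizer(view_a, view_a))  # identical views: no disagreement
print(co_regularizer(view_a, view_b))  # conflicting views: positive penalty
```

In a full model, a term of this kind would be weighted and combined with the per-view PLSA log-likelihoods, with the EM updates adjusted accordingly; that part is not sketched here.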
Similar resources
Co-Regularized PLSA for Multi-Modal Learning
Many learning problems in real world applications involve rich datasets comprising multiple information modalities. In this work, we study co-regularized PLSA (coPLSA) as an efficient solution to probabilistic topic analysis of multi-modal data. In coPLSA, similarities between topic compositions of a data entity across different data modalities are measured with divergences between discrete pro...
Co-regularized Multi-view Spectral Clustering
In many clustering problems, we have access to multiple views of the data each of which could be individually used for clustering. Exploiting information from multiple views, one can hope to find a clustering that is more accurate than the ones obtained using the individual views. Often these different views admit same underlying clustering of the data, so we can approach this problem by lookin...
Integrating Clustering and Multi-Document Summarization by Bi-Mixture Probabilistic Latent Semantic Analysis (PLSA) with Sentence Bases
Probabilistic Latent Semantic Analysis (PLSA) has been popularly used in document analysis. However, as it is currently formulated, PLSA strictly requires the number of word latent classes to be equal to the number of document latent classes. In this paper, we propose Bi-mixture PLSA, a new formulation of PLSA that allows the number of latent word classes to be different from the number of late...
Document Clustering in a Learned Concept Space
Document clustering is one of the fundamental techniques of unsupervised learning from unstructured textual data which constitutes a real saving in terms of efficiency for various information retrieval (IR) tasks. The clustering results are not only used as basic information for the structure of a collection, but also as a preceding step before conducting other IR applications. On the other han...
Multi-View Clustering via Joint Nonnegative Matrix Factorization
Many real-world datasets are comprised of different representations or views which often provide information complementary to each other. To integrate information from multiple views in the unsupervised setting, multiview clustering algorithms have been developed to cluster multiple views simultaneously to derive a solution which uncovers the common latent structure shared by multiple views. In...
Journal title:
Volume, issue:
Pages: -
Publication date: 2012